Results 1 - 10 of 10
1.
EBioMedicine ; 102: 105075, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38565004

ABSTRACT

BACKGROUND: AI models have shown promise in performing many medical imaging tasks. However, our ability to explain what signals these models have learned is severely lacking. Explanations are needed to increase the trust of doctors in AI-based models, especially in domains where AI prediction capabilities surpass those of humans. Moreover, such explanations could enable novel scientific discovery by uncovering signals in the data that are not yet known to experts. METHODS: In this paper, we present a workflow for generating hypotheses to understand which visual signals in images are correlated with a classification model's predictions for a given task. This approach leverages an automatic visual explanation algorithm followed by interdisciplinary expert review. We propose the following four steps: (i) Train a classifier to perform a given task to assess whether the imagery indeed contains signals relevant to the task; (ii) Train a StyleGAN-based image generator with an architecture that enables guidance by the classifier ("StylEx"); (iii) Automatically detect, extract, and visualize the top visual attributes that the classifier is sensitive to. For visualization, we independently modify each of these attributes to generate counterfactual visualizations for a set of images (i.e., what the image would look like with the attribute increased or decreased); (iv) Formulate hypotheses for the underlying mechanisms, to stimulate future research. Specifically, we present the discovered attributes and corresponding counterfactual visualizations to an interdisciplinary panel of experts so that hypotheses can account for social and structural determinants of health (e.g., whether the attributes correspond to known patho-physiological or socio-cultural phenomena, or could be novel discoveries).
FINDINGS: To demonstrate the broad applicability of our approach, we present results on eight prediction tasks across three medical imaging modalities-retinal fundus photographs, external eye photographs, and chest radiographs. We showcase examples where many of the automatically-learned attributes clearly capture clinically known features (e.g., types of cataract, enlarged heart), and demonstrate automatically-learned confounders that arise from factors beyond physiological mechanisms (e.g., chest X-ray underexposure is correlated with the classifier predicting abnormality, and eye makeup is correlated with the classifier predicting low hemoglobin levels). We further show that our method reveals a number of physiologically plausible attributes that, based on the literature, were previously unknown (e.g., differences in the fundus associated with self-reported sex). INTERPRETATION: Our approach enables hypothesis generation via attribute visualizations and has the potential to enable researchers to better understand, assess, and extract new knowledge from AI-based models, as well as debug and design better datasets. Importantly, though our framework is not designed to infer causality, we highlight that attributes generated by it can capture phenomena beyond physiology or pathophysiology, reflecting the real-world nature of healthcare delivery and socio-cultural factors, and hence interdisciplinary perspectives are critical in these investigations. Finally, we will release code to help researchers train their own StylEx models and analyze their predictive tasks of interest, and use the methodology presented in this paper for responsible interpretation of the revealed attributes. FUNDING: Google.
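Step (iii), ranking the attributes a classifier is most sensitive to via counterfactual perturbations, can be sketched as follows. This is a toy illustration, not the paper's implementation: an 8-dimensional "style space" and a hand-set logistic model stand in for StylEx's StyleGAN style coordinates and trained image classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy classifier: logistic model over an 8-d "style space".
# In StylEx the classifier is a deep network and the coordinates come
# from a classifier-guided StyleGAN; both are stand-ins here.
w = np.array([2.0, 0.1, -1.5, 0.0, 0.3, 0.0, 0.05, -0.2])

def classifier(style):
    return 1.0 / (1.0 + np.exp(-style @ w))

def attribute_influence(style, delta=1.0):
    """Independently perturb each style coordinate up and down, and
    record how far the classifier's output moves (counterfactual
    sensitivity). Larger shift = more influential attribute."""
    shifts = []
    for i in range(style.size):
        up, down = style.copy(), style.copy()
        up[i] += delta
        down[i] -= delta
        shifts.append(abs(classifier(up) - classifier(down)))
    return np.array(shifts)

styles = rng.normal(size=(100, 8))
mean_shift = np.mean([attribute_influence(s) for s in styles], axis=0)
top_attrs = np.argsort(mean_shift)[::-1][:3]   # attributes to visualize
```

The top-ranked coordinates are exactly those the classifier weights most heavily; in the real workflow these would be rendered as counterfactual image pairs for the expert panel.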


Subjects
Algorithms; Cataract; Humans; Cardiomegaly; Fundus Oculi; Artificial Intelligence
2.
Neuroimage ; 254: 119121, 2022 07 01.
Article in English | MEDLINE | ID: mdl-35342004

ABSTRACT

Reconstructing natural images and decoding their semantic category from fMRI brain recordings is challenging. Acquiring sufficient pairs of images and their corresponding fMRI responses, which span the huge space of natural images, is prohibitive. We present a novel self-supervised approach that goes well beyond the scarce paired data, for achieving both: (i) state-of-the-art fMRI-to-image reconstruction, and (ii) first-ever large-scale semantic classification from fMRI responses. By imposing cycle consistency between a pair of deep neural networks (from image-to-fMRI & from fMRI-to-image), we train our image reconstruction network on a large number of "unpaired" natural images (images without fMRI recordings) from many novel semantic categories. This enables us to adapt our reconstruction network to a very rich semantic coverage without requiring any explicit semantic supervision. Specifically, we find that combining our self-supervised training with high-level perceptual losses gives rise to new reconstruction & classification capabilities. In particular, this perceptual training enables accurate classification of fMRIs of never-before-seen semantic classes, without requiring any class labels during training. This gives rise to: (i) Unprecedented image-reconstruction from fMRI of never-before-seen images (evaluated by image metrics and human testing), and (ii) Large-scale semantic classification of categories that were never-before-seen during network training. Such large-scale (1000-way) semantic classification from fMRI recordings has never been demonstrated before. Finally, we provide evidence for the biological consistency of our learned model.
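The core self-supervised idea, training on unpaired images by requiring that encoding to fMRI space and decoding back reproduces the input, can be sketched with a cycle-consistency loss. This is a minimal sketch under strong assumptions: linear encoder/decoder maps and toy dimensions replace the paper's deep image-to-fMRI and fMRI-to-image networks.

```python
import numpy as np

rng = np.random.default_rng(1)
d_img, d_fmri, n = 16, 8, 200   # hypothetical sizes for illustration

# "Unpaired" natural images: no fMRI recordings exist for them, yet
# they can still train the cycle image -> fMRI -> image.
X = rng.normal(size=(n, d_img))

E = rng.normal(scale=0.1, size=(d_fmri, d_img))  # image -> fMRI
D = rng.normal(scale=0.1, size=(d_img, d_fmri))  # fMRI -> image

def cycle_loss(E, D, X):
    # || D(E(x)) - x ||^2 averaged over the unpaired set
    return np.mean((X @ E.T @ D.T - X) ** 2)

lr, losses = 0.05, []
for _ in range(300):
    F = X @ E.T                 # predicted fMRI responses
    R = F @ D.T                 # reconstruction back to image space
    err = R - X                 # cycle-consistency error
    D -= lr * 2 * err.T @ F / n # gradient of the cycle loss w.r.t. D
    E -= lr * 2 * (err @ D).T @ X / n
    losses.append(cycle_loss(E, D, X))
```

Because the fMRI space here is lower-dimensional than the image space, the loss cannot reach zero; it converges toward the best low-rank reconstruction, which is enough to show the mechanism.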


Subjects
Neural Networks, Computer; Semantics; Brain/diagnostic imaging; Humans; Image Processing, Computer-Assisted/methods; Magnetic Resonance Imaging/methods
3.
Nat Commun ; 10(1): 4934, 2019 10 30.
Article in English | MEDLINE | ID: mdl-31666525

ABSTRACT

The discovery that deep convolutional neural networks (DCNNs) achieve human performance in realistic tasks offers fresh opportunities for linking neuronal tuning properties to such tasks. Here we show that the face-space geometry, revealed through pairwise activation similarities of face-selective neuronal groups recorded intracranially in 33 patients, significantly matches that of a DCNN having human-level face recognition capabilities. This convergent evolution of pattern similarities across biological and artificial networks highlights the significance of face-space geometry in face perception. Furthermore, the nature of the neuronal-to-DCNN match suggests a role of human face areas in pictorial aspects of face perception. First, the match was confined to intermediate DCNN layers. Second, presenting identity-preserving image manipulations to the DCNN abolished its correlation to neuronal responses. Finally, DCNN units matching human neuronal group tuning displayed viewpoint-selective receptive fields. Our results demonstrate the importance of face-space geometry in the pictorial aspects of human face perception.
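Comparing "face-space geometry" across systems amounts to correlating the pairwise-similarity matrices of two sets of responses to the same faces (a representational-similarity-style analysis). Below is a minimal sketch on synthetic data; the response matrices, the shared 5-d latent, and the noise level are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n_faces = 25

# Hypothetical responses of 40 "neurons" and 128 "DCNN units" to the
# same 25 faces, sharing a common latent face-space plus noise.
latent = rng.normal(size=(n_faces, 5))
neural = latent @ rng.normal(size=(5, 40)) + 0.5 * rng.normal(size=(n_faces, 40))
dcnn = latent @ rng.normal(size=(5, 128)) + 0.5 * rng.normal(size=(n_faces, 128))

def similarity_matrix(responses):
    """Pairwise correlation between response patterns to each face."""
    z = responses - responses.mean(axis=1, keepdims=True)
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    return z @ z.T

def geometry_match(a, b):
    """Correlate the upper triangles of two similarity matrices --
    the face-space geometry comparison."""
    iu = np.triu_indices(a.shape[0], k=1)
    return np.corrcoef(a[iu], b[iu])[0, 1]

r = geometry_match(similarity_matrix(neural), similarity_matrix(dcnn))

# Control: shuffling face identities in one system destroys the match.
perm = rng.permutation(n_faces)
r_shuffled = geometry_match(similarity_matrix(neural),
                            similarity_matrix(dcnn[perm]))
```

The shuffled control is the standard sanity check: a genuine geometry match must collapse when the face correspondence is broken.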


Subjects
Cerebral Cortex/physiology; Facial Recognition/physiology; Image Interpretation, Computer-Assisted; Neural Networks, Computer; Neurons/physiology; Adult; Electroencephalography; Female; Humans; Male; Middle Aged; Young Adult
4.
Front Comput Neurosci ; 12: 57, 2018.
Article in English | MEDLINE | ID: mdl-30087604

ABSTRACT

Visual perception involves continuously choosing the most prominent inputs while suppressing others. Neuroscientists induce visual competitions in various ways to study why and how the brain makes choices of what to perceive. Recently, deep neural networks (DNNs) have been used as models of the ventral stream of the visual system, due to similarities in both accuracy and hierarchy of feature representation. In this study we created non-dynamic visual competitions for humans by briefly presenting mixtures of two images. We then tested feed-forward DNNs with similar mixtures and examined their behavior. We found that both humans and DNNs tend to perceive only one image when presented with a mixture of two. We revealed image parameters which predict this perceptual dominance and compared their predictability for the two visual systems. Our findings can be used both to improve DNNs as models and, potentially, to improve their performance by imitating biological behaviors.
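The mixture paradigm can be sketched as follows: blend two inputs and ask a classifier which one wins. Everything here is a toy stand-in (random class "templates" and a nearest-template classifier instead of photographs and a deep network); it only illustrates that a forced-choice readout of a mixture yields a single winner rather than an average.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two hypothetical class "templates" playing the role of the two images'
# categories; a linear score against each template plays the classifier.
templates = rng.normal(size=(2, 64))

def classify(image):
    scores = templates @ image
    return int(np.argmax(scores)), scores

def mix(a, b, alpha):
    """Blend two images, as in the briefly-presented mixtures."""
    return alpha * a + (1 - alpha) * b

img0 = templates[0] + 0.1 * rng.normal(size=64)   # instance of class 0
img1 = templates[1] + 0.1 * rng.normal(size=64)   # instance of class 1

# Sweep the mixing weight: the winning class flips as one image comes
# to dominate the mixture.
winners = [classify(mix(img0, img1, a))[0] for a in (0.9, 0.7, 0.3, 0.1)]
```

A natural extension, closer to the paper, would be to ask which image parameters (contrast, spatial frequency, etc.) predict the winner.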

5.
IEEE Trans Pattern Anal Mach Intell ; 36(6): 1092-106, 2014 Jun.
Article in English | MEDLINE | ID: mdl-26353273

ABSTRACT

We define a "good image cluster" as one in which images can be easily composed (like a puzzle) using pieces from each other, while being difficult to compose from images outside the cluster. The larger and more statistically significant the pieces are, the stronger the affinity between the images. This gives rise to unsupervised discovery of very challenging image categories. We further show how multiple images can be composed from each other simultaneously and efficiently using a collaborative randomized search algorithm. This collaborative process exploits the "wisdom of crowds of images" to obtain a sparse yet meaningful set of image affinities, in time that is almost linear in the size of the image collection. "Clustering-by-Composition" yields state-of-the-art results on current benchmark data sets. It further yields promising results on new challenging data sets, such as data sets with very few images (where a "cluster model" cannot be "learned" by current methods), and a subset of the PASCAL VOC data set (with huge variability in scale and appearance).
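The composition-based affinity can be sketched as: how well can patches of one image be reassembled from pieces of another? This toy version uses exhaustive best-match search over small fixed-size patches; the paper instead uses a collaborative randomized search and weighs large pieces by statistical significance, so treat the images, sizes, and scoring below as illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

def composition_affinity(img_a, img_b, patch=4, n_samples=150):
    """Sample patches of img_a and score the best (lowest-error) match
    anywhere in img_b; easy composition = high affinity."""
    H, W = img_a.shape
    errs = []
    for _ in range(n_samples):
        y = rng.integers(0, H - patch + 1)
        x = rng.integers(0, W - patch + 1)
        p = img_a[y:y + patch, x:x + patch]
        best = min(np.mean((img_b[i:i + patch, j:j + patch] - p) ** 2)
                   for i in range(H - patch + 1)
                   for j in range(W - patch + 1))
        errs.append(best)
    return -float(np.mean(errs))      # lower composition error => higher affinity

tex = rng.normal(size=(24, 24))       # shared "category" content
img_a = tex[:16, :16]
img_b = tex[2:18, 2:18]               # overlaps img_a -> same cluster
img_c = rng.normal(size=(16, 16))     # unrelated image

aff_same = composition_affinity(img_a, img_b)
aff_diff = composition_affinity(img_a, img_c)
```

Images sharing content compose cheaply from each other, so their affinity comes out higher than against an unrelated image, which is the signal the clustering builds on.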

6.
Magn Reson Med ; 63(6): 1594-600, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20512863

ABSTRACT

Single-scan MRI underlies a wide variety of clinical and research activities, including functional and diffusion studies. Most common among these "ultrafast" MRI approaches is echo-planar imaging. Notwithstanding its proven success, echo-planar imaging still faces a number of limitations, particularly as a result of susceptibility heterogeneities and of chemical shift effects that can become acute at high fields. The present study explores a new approach for acquiring multidimensional MR images in a single scan, which possesses a higher built-in immunity to this kind of heterogeneity while retaining echo-planar imaging's temporal and spatial performances. This new protocol combines a novel approach to multidimensional spectroscopy, based on the spatial encoding of the spin interactions, with image reconstruction algorithms based on super-resolution principles. Single-scan two-dimensional MRI examples of the performance improvements provided by the resulting imaging protocol are illustrated using phantom-based and in vivo experiments.


Subjects
Algorithms; Magnetic Resonance Imaging/methods; Phantoms, Imaging; Animals; Mice
7.
IEEE Trans Pattern Anal Mach Intell ; 29(12): 2247-53, 2007 Dec.
Article in English | MEDLINE | ID: mdl-17934233

ABSTRACT

Human action in video sequences can be seen as silhouettes of a moving torso and protruding limbs undergoing articulated motion. We regard human actions as three-dimensional shapes induced by the silhouettes in the space-time volume. We adopt a recent approach for analyzing 2D shapes and generalize it to deal with volumetric space-time action shapes. Our method utilizes properties of the solution to the Poisson equation to extract space-time features such as local space-time saliency, action dynamics, shape structure and orientation. We show that these features are useful for action recognition, detection and clustering. The method is fast, does not require video alignment and is applicable in (but not limited to) many scenarios where the background is known. Moreover, we demonstrate the robustness of our method to partial occlusions, non-rigid deformations, significant changes in scale and viewpoint, high irregularities in the performance of an action, and low quality video.
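The Poisson-equation feature has a compact toy version in 2D: solve Laplacian(U) = -1 inside a binary silhouette with U = 0 outside. U(p) equals the expected time for a random walk from p to exit the shape, so thick parts (torso) get high values and thin protrusions (limbs) low ones; the paper generalizes this to 3D space-time volumes. The silhouette below is an invented rectangle-plus-limb, solved by plain Jacobi relaxation.

```python
import numpy as np

# Binary silhouette: a "torso" block with a protruding "limb".
mask = np.zeros((32, 32), dtype=bool)
mask[8:24, 12:20] = True
mask[10:14, 4:12] = True

# Jacobi relaxation for Laplacian(U) = -1 inside the mask, U = 0 outside:
# each interior pixel becomes the average of its 4 neighbors plus a
# constant source term.
U = np.zeros(mask.shape)
for _ in range(2000):
    nb = (np.roll(U, 1, 0) + np.roll(U, -1, 0) +
          np.roll(U, 1, 1) + np.roll(U, -1, 1))
    U = np.where(mask, (nb + 1.0) / 4.0, 0.0)
```

Space-time saliency, dynamics, and orientation features are then derived from U and its derivatives; here one can already see the torso/limb distinction in the magnitude of U alone.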


Subjects
Artificial Intelligence; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Models, Biological; Movement/physiology; Pattern Recognition, Automated/methods; Whole Body Imaging/methods; Algorithms; Computer Simulation; Humans; Reproducibility of Results; Sensitivity and Specificity
8.
IEEE Trans Pattern Anal Mach Intell ; 29(11): 2045-56, 2007 Nov.
Article in English | MEDLINE | ID: mdl-17848783

ABSTRACT

We introduce a behavior-based similarity measure which tells us whether two different space-time intensity patterns of two different video segments could have resulted from a similar underlying motion field. This is done directly from the intensity information, without explicitly computing the underlying motions. Such a measure allows us to detect similarity between video segments of differently dressed people performing the same type of activity. It requires no foreground/background segmentation, no prior learning of activities, and no motion estimation or tracking. Using this behavior-based similarity measure, we extend the notion of 2-dimensional image correlation into the 3-dimensional space-time volume, thus allowing us to correlate dynamic behaviors and actions. Small space-time video segments (small video clips) are "correlated" against entire video sequences in all three dimensions (x, y, and t). Peak correlation values correspond to video locations with similar dynamic behaviors. Our approach can detect very complex behaviors in video sequences (e.g., ballet movements, pool dives, running water), even when multiple complex activities occur simultaneously within the field-of-view of the camera. We further show its robustness to small changes in scale and orientation of the correlated behavior.
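Correlating a small clip against an entire sequence in all three dimensions can be sketched with plain 3D normalized cross-correlation over a synthetic volume. Note this is a simplified stand-in: the paper's measure scores motion-field consistency rather than raw intensity match, so the template, volume sizes, and event placement below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

# A "video" volume (t, y, x) with a small dynamic event pasted at a
# known space-time location; the template clip is correlated against
# the whole volume in x, y, and t.
video = rng.normal(size=(40, 24, 24))
clip = rng.normal(size=(5, 6, 6))
t0, y0, x0 = 20, 10, 8
video[t0:t0 + 5, y0:y0 + 6, x0:x0 + 6] = clip + 0.1 * rng.normal(size=clip.shape)

def ncc_volume(video, clip):
    """Normalized cross-correlation of a space-time template at every
    valid (t, y, x) offset of the volume."""
    T, Y, X = clip.shape
    c = (clip - clip.mean()) / clip.std()
    out = np.empty((video.shape[0] - T + 1,
                    video.shape[1] - Y + 1,
                    video.shape[2] - X + 1))
    for t in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                w = video[t:t + T, y:y + Y, x:x + X]
                w = (w - w.mean()) / (w.std() + 1e-8)
                out[t, y, x] = np.mean(w * c)
    return out

corr = ncc_volume(video, clip)
peak = np.unravel_index(np.argmax(corr), corr.shape)
```

The correlation peak recovers the event's space-time location, the same "peak = similar dynamic behavior" readout described in the abstract.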


Subjects
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Imaging, Three-Dimensional/methods; Pattern Recognition, Automated/methods; Video Recording/methods; Discriminant Analysis; Image Enhancement/methods; Motion; Reproducibility of Results; Sensitivity and Specificity; Statistics as Topic
9.
IEEE Trans Pattern Anal Mach Intell ; 29(3): 463-76, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17224616

ABSTRACT

This paper presents a new framework for the completion of missing information based on local structures. It poses the task of completion as a global optimization problem with a well-defined objective function and derives a new algorithm to optimize it. Missing values are constrained to form coherent structures with respect to reference examples. We apply this method to space-time completion of large space-time "holes" in video sequences of complex dynamic scenes. The missing portions are filled in by sampling spatio-temporal patches from the available parts of the video, while enforcing global spatio-temporal consistency between all patches in and around the hole. The consistent completion of static scene parts simultaneously with dynamic behaviors leads to realistic looking video sequences and images. Space-time video completion is useful for a variety of tasks, including, but not limited to: 1) Sophisticated video removal (of undesired static or dynamic objects) by completing the appropriate static or dynamic background information. 2) Correction of missing/corrupted video frames in old movies. 3) Modifying a visual story by replacing unwanted elements. 4) Creation of video textures by extending smaller ones. 5) Creation of complete field-of-view stabilized video. 6) As images are one-frame videos, we apply the method to this special case as well.
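The exemplar-filling idea can be shown in a one-dimensional toy: complete a missing segment of a periodic signal by copying the candidate segment whose surrounding context best matches the hole's context. The paper fills 3D spatio-temporal patches under a global consistency objective; this sketch is greedy, one-dimensional, and uses an invented sinusoidal "video", so it illustrates only the sampling-from-available-data principle.

```python
import numpy as np

rng = np.random.default_rng(8)

t = np.arange(400)
truth = np.sin(2 * np.pi * t / 25) + 0.02 * rng.normal(size=t.size)
signal = truth.copy()

h0, h1, ctx = 180, 205, 10          # hole [180, 205), context width 10
hole_len = h1 - h0
signal[h0:h1] = 0.0                  # the missing "frames"

left_ctx = signal[h0 - ctx:h0]       # known samples flanking the hole
right_ctx = signal[h1:h1 + ctx]

best_err, best_s = np.inf, None
for s in range(ctx, t.size - hole_len - ctx):
    # skip candidates whose context window touches the (unknown) hole
    if s - ctx < h1 and s + hole_len + ctx > h0:
        continue
    err = (np.sum((signal[s - ctx:s] - left_ctx) ** 2) +
           np.sum((signal[s + hole_len:s + hole_len + ctx] - right_ctx) ** 2))
    if err < best_err:
        best_err, best_s = err, s

signal[h0:h1] = signal[best_s:best_s + hole_len]   # copy the best segment
```

Because the signal is periodic, the best-matching context pins down the correct phase and the copied segment restores the hole almost exactly; the full method enforces this kind of agreement between all overlapping patches simultaneously.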


Subjects
Algorithms; Artificial Intelligence; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Information Storage and Retrieval/methods; Pattern Recognition, Automated/methods; Video Recording/methods; Artifacts; Cluster Analysis; Reproducibility of Results; Sensitivity and Specificity
10.
IEEE Trans Pattern Anal Mach Intell ; 28(9): 1530-5, 2006 Sep.
Article in English | MEDLINE | ID: mdl-16929739

ABSTRACT

Real-world action recognition applications require the development of systems which are fast, can handle a large variety of actions without a priori knowledge of the type of actions, need a minimal number of parameters, and require as short a learning stage as possible. In this paper, we suggest such an approach. We regard dynamic activities as long-term temporal objects, which are characterized by spatio-temporal features at multiple temporal scales. Based on this, we design a simple statistical distance measure between video sequences which captures the similarities in their behavioral content. This measure is nonparametric and can thus handle a wide range of complex dynamic actions. Having a behavior-based distance measure between sequences, we use it for a variety of tasks, including: video indexing, temporal segmentation, and action-based video clustering. These tasks are performed without prior knowledge of the types of actions, their models, or their temporal extents.
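A nonparametric behavior distance of this flavor can be sketched as follows: summarize each sequence by histograms of temporal-gradient magnitudes at several temporal scales, then compare histograms with a chi-square distance. The exact features, scales, and synthetic "sequences" below are illustrative assumptions, not the paper's feature set.

```python
import numpy as np

rng = np.random.default_rng(5)

def temporal_gradient_histograms(video, bins=8):
    """Behavior signature: histograms of absolute temporal gradients at
    multiple temporal scales, concatenated."""
    feats = []
    for scale in (1, 2, 4):
        g = np.abs(video[scale:] - video[:-scale]).ravel()
        h, _ = np.histogram(g, bins=bins, range=(0, 4), density=True)
        feats.append(h)
    return np.concatenate(feats)

def chi_square_distance(h1, h2):
    """Nonparametric distance between two behavior signatures."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + 1e-8))

# Two fast synthetic "sequences" (same dynamics, different phase) and
# one slow sequence, each of shape (frames, height, width).
t = np.arange(64)[:, None, None]
grid = rng.normal(size=(1, 16, 16))
fast_a = np.sin(0.8 * t) * grid + 0.05 * rng.normal(size=(64, 16, 16))
fast_b = np.sin(0.8 * t + 1.0) * grid + 0.05 * rng.normal(size=(64, 16, 16))
slow = np.sin(0.1 * t) * grid + 0.05 * rng.normal(size=(64, 16, 16))

d_same = chi_square_distance(temporal_gradient_histograms(fast_a),
                             temporal_gradient_histograms(fast_b))
d_diff = chi_square_distance(temporal_gradient_histograms(fast_a),
                             temporal_gradient_histograms(slow))
```

Sequences with similar dynamics land close under the distance regardless of phase, while differently paced dynamics land far apart, which is exactly the property that lets such a measure drive indexing, segmentation, and clustering without action models.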


Subjects
Algorithms; Artificial Intelligence; Image Interpretation, Computer-Assisted/methods; Movement/physiology; Pattern Recognition, Automated/methods; Video Recording/methods; Walking/physiology; Humans; Image Enhancement/methods; Information Storage and Retrieval/methods; Kinetics